
August 8, 2002



Processor Fits Into Pin-Limited Design

By Charles Minter, Gary Neben & Titus Smith
Integrated System Design

March 1, 2002 (12:25 p.m. EST)

What do designers do when faced with the demanding requests of their marketing department? They innovate. We found that necessity is indeed the mother of invention as applied to microprocessor design. Our design team was able to pack more features into an embedded microprocessor while staying within the intended sales price for the IC. Our RISC marketing team proposed the product that eventually became the TMPR4925 embedded microprocessor based on the team's understanding of the digital consumer market and on its discussions with potential customers. The initial concept consisted of a list of functions and a sales price. When the function required a new design, most features were left up to the engineering team. The desired sales price governed package selection and had two major impacts on the design work: It set the maximum power dissipation to about 1 watt and the number of pins for the chip to 256.

We overcame those limitations by designing multi-purpose memory controllers that could handle a wide variety of memory types and by sharing pins among functions. This article will discuss our strategy and the trade-offs made in the specification, design and verification phases of the project.

The power dissipation limit in turn capped the clock frequency of the peripherals. The CPU core has a maximum operational frequency of 200 MHz, and its external interface operates at a fraction of the CPU frequency. The supported clock ratios (CPU:interface) include 2:1, 2.5:1 and 3:1. Both the internal logic and the output buffers contribute to the total chip power dissipation. System considerations and design simplicity suggested that the internal bus frequency match the external synchronous memory interface clock rate. Since the power dissipation limit prohibited the use of the 2:1 ratio (a 100-MHz bus at the maximum CPU frequency), we used an internal bus frequency of 80 MHz, which corresponds to the 2.5:1 ratio.
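The arithmetic behind that choice can be sketched as follows. The 200-MHz core frequency and the three ratios come from the text; the function and variable names are illustrative only.

```python
from fractions import Fraction

CPU_MHZ = 200  # maximum core frequency stated in the article

# Supported CPU:interface clock ratios listed in the article.
RATIOS = {"2:1": Fraction(2), "2.5:1": Fraction(5, 2), "3:1": Fraction(3)}

def interface_mhz(cpu_mhz, ratio):
    """Return the interface (bus) clock for a given CPU:interface ratio."""
    return Fraction(cpu_mhz) / ratio

# Print the bus frequency each ratio would give at the maximum CPU clock.
for name, ratio in RATIOS.items():
    print(f"{name}: {float(interface_mhz(CPU_MHZ, ratio)):.1f} MHz")
```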

Unfortunately, the pin count required by all the desired functions was much greater than the 256 pins available. Thus, one of the first engineering tasks was to define a way to share the pins of the package to maximize the utility of functions available simultaneously. There were several engineering issues involved with devising a pin-sharing plan. The timing requirements for some paths to the pins were very tight, so we avoided using those pins for any function except their primary one. We also generated custom JTAG boundary-scan logic to minimize the logic levels through which a signal would need to pass in order to reach the pin.

One of the most important issues was ground and power bounce. To minimize it, we used slew-rate-limited output buffers wherever possible, but those slower output buffers were not appropriate for any pin that could have a high-speed function assigned to it. The mix of high-speed and low-speed output buffers caused other design issues. We went through several iterations of pin-sharing and pin-location specifications (Fig. 1).
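The two checks described here, choosing a buffer type per pin and confirming that the functions enabled together do not claim the same pin, can be sketched roughly as follows. This is not the team's actual tooling; the pin names, function names and plan are invented for illustration.

```python
# Hypothetical pin-sharing plan: pin name -> functions that may use it.
# All pin and function names below are invented for illustration.
PIN_PLAN = {
    "P1": ["sdram_dq"],                  # timing-critical, so not shared
    "P2": ["nand_ctl", "gpio"],
    "P3": ["pcmcia_ctl", "uart", "gpio"],
}
HIGH_SPEED = {"sdram_dq"}                # functions that need a fast buffer

def buffer_type(functions):
    """Use a fast buffer if any function assigned to the pin is high-speed."""
    return "fast" if HIGH_SPEED & set(functions) else "slew_limited"

def conflicts(enabled):
    """Return the pins claimed by more than one simultaneously enabled function."""
    clashes = {}
    for pin, functions in PIN_PLAN.items():
        users = [f for f in functions if f in enabled]
        if len(users) > 1:
            clashes[pin] = users
    return clashes
```

For example, `conflicts({"nand_ctl", "uart"})` returns an empty dictionary, while enabling `uart` and `gpio` together reports a clash on the hypothetical pin `P3`.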

Fig. 2 shows our solution. The engineering design team defined the detailed feature set of each of the performance-critical functions designed for the project. Other peripherals were standard intellectual-property components from other Toshiba design groups or from third-party IP vendors. We tried to include features that would make the chip easy to use in a wide range of applications. Some features were designed to minimize the number of external components; others were devised to reduce board design difficulties by easing timing requirements.

We weighed the benefit of each candidate feature against its cost. Considerations included the design difficulty, the design risk, the implementation difficulty and the verification effort. The design risk included both the risk that the new feature would not work correctly and the risk that adding the logic for the new feature would interfere with the operation of other features even when the new feature was not used. Implementation difficulty refers primarily to meeting timing constraints during logic synthesis. Verification effort includes verification of the feature, in combination with many possibly related features, before the chip is fabricated and confirmation of operation on the real chip.

The following discussion of the memory controllers gives examples of the feature selection and design trade-offs we made.

The NAND flash controller, external bus controller and synchronous memory controller peripherals have a major impact on the performance, ease of use and range of applications of the chip. They control a wide variety of memory types and support many options for configuration and timing of the memory. The memory controllers share the external data bus and its control logic. To minimize the loading on that high-speed bus presented by slow-speed memories, the bus can be split into a fast bus directly connected to the MPU and a slow bus separated by a buffer (e.g., 74x245 or QuickSwitch). All synchronous memory must be on the fast bus, but all other types of memory can be placed on the fast bus or the slow bus, if it exists.

The NAND flash controller was included to provide very low-cost nonvolatile storage. This type of memory is not randomly accessible and is slower than most other memory types, so the controller could be lower-performance. It was derived from another Toshiba product with changes to the bus interface and the elimination of unneeded features. Because the slow memory probably won't be used in all applications, we were able to share the pins used for the NAND flash interface with other functions. To avoid the requirement to have two types of nonvolatile memory in a system, we provided a mechanism to boot from NAND flash.

The external bus controller (EBUSC) handles a wide range of asynchronous memory types with different timing requirements. One instance of the control logic is shared by all eight EBUSC channels. At the beginning of each cycle, the parameters for the selected channel are fed to the control logic, which then executes the cycle based on those parameters. The EBUSC can generate single external memory accesses as short as 37.5 nanoseconds and as long as 4.5 microseconds.
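A minimal sketch of that arrangement, with one control routine parameterized by per-channel settings, is shown below. It assumes the EBUSC is timed from the 80-MHz bus clock, in which case the stated limits work out to 3 and 360 clocks; the class and field names are hypothetical and do not reflect the chip's register layout.

```python
from dataclasses import dataclass

BUS_MHZ = 80                   # bus clock from the article (assumed to time the EBUSC)
CLOCK_NS = 1000 / BUS_MHZ      # 12.5 ns per clock

@dataclass
class ChannelParams:
    """Hypothetical per-channel settings; the field name is illustrative."""
    access_clocks: int         # total clocks for one single access

def run_cycle(channels, selected):
    """One shared control routine, driven by the selected channel's parameters."""
    params = channels[selected]              # loaded at the start of the cycle
    return params.access_clocks * CLOCK_NS   # access duration in nanoseconds

# Eight channels: seven at the shortest access, one at the longest.
channels = [ChannelParams(3)] * 7 + [ChannelParams(360)]
print(run_cycle(channels, 0))   # 37.5
print(run_cycle(channels, 7))   # 4500.0
```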

Some memory devices require setup and hold times between some of the address, control and data signals. To allow for that and to limit the complexity and number of control registers in the design, we used a global approach for setting the time between the appropriate signals. A single field, called the setup/hold wait time, is used to set the time between signal transitions. The extra time required in a cycle by using the same time for all setup and hold times should not be a problem, since this feature is typically used for slow devices. We made an additional compromise in the feature set of the EBUSC to keep the control logic simple: the setup/hold wait feature can't be used for burst or page-mode accesses.

A potential customer requested that we add a PCMCIA interface, so we extended the top-level specification to include that requirement. The flexibility of the EBUSC made the addition easy, and no changes were required to the main state machine. We added some logic to generate some of the unique control signals used by the PCMCIA interface. The chip-select signals for channels 6 and 7 of the EBUSC never connect to a pin. If those channels are used for PCMCIA support, the remaining six channels can support other memories or devices.

The SDRAM controller supports a wide array of SDR SDRAMs (including 512-Mbit SDRAMs), SDRAM-based DIMMs (including registered DIMMs), and SyncFlash memories. We selected that set of memories because we thought that it would best address a range of applications. We deleted support for two memory types from the previous version of the controller. While the interface to flash DIMM modules (using asynchronous NOR flash memories) was similar to that of an SDRAM, there were enough differences to require a separate block of logic that fit awkwardly in the controller.

The SyncFlash components are much faster, and the logic to support them fits much more cleanly in the controller. Therefore, we thought SyncFlash could replace flash DIMMs. We also dropped support for synchronous masked ROM (SMROM) devices. SyncFlash devices are superior because they are writable, are faster and support a burst size of two. That burst size can support both incrementing and decrementing bursts. We also added an option to boot from SyncFlash memory to reduce total system memory cost.

The SDRAM controller supports four channels of mixed SDRAM and SyncFlash memories. We compromised between flexibility and design size by using a single set of timing parameters for all devices and channels. We reasoned that designers have more control over the memories used in an embedded-application system than in an open system. Also, current memory devices are designed for use in systems with higher clock rates than our moderate 80 MHz. A control register for each channel specifies the type of memory, its organization and a few other control parameters.

The trade-offs in selecting features to include in the SDRAM controller were especially important since SDRAM will likely provide the fastest storage. We were able to reduce the latency of accesses to SDRAM by one clock cycle from the previous version by simplifying the logic used to start a cycle. Eliminating support for SMROM and Flash DIMM memory types made that simplification possible. Adding support for SyncFlash devices provided the impetus to allow holding pages open after the access that opened them.
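The idea of holding a page open can be sketched as a small tracker that classifies each access. The article does not say whether open pages are tracked per channel or per bank; this sketch assumes one open row per channel, and the class and method names are hypothetical.

```python
class OpenPageTracker:
    """Remembers which row is held open on each channel (an assumed granularity)."""

    def __init__(self, channels=4):
        self.open_row = [None] * channels

    def access(self, channel, row):
        """Classify an access, then leave its row open for later accesses."""
        current = self.open_row[channel]
        if current is None:
            kind = "open"   # nothing open yet: a normal access opens the page
        elif current == row:
            kind = "hit"    # the page is already open
        else:
            kind = "miss"   # a different page is open and must be replaced
        self.open_row[channel] = row
        return kind
```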

The feature works well for SyncFlash devices, since there is no miss penalty beyond a normal access. The feature is also available for SDRAMs, but since there is a miss penalty for those devices, the expected performance benefit will be less than for SyncFlash. Furthermore, the feature is available for the 32-bit data width but not for the 16-bit width. That restriction allows the logic that starts an open-page cycle to be simpler, contributing to the elimination of a cycle at the beginning of all accesses. Meeting the timing requirements for SDRAM accesses will probably be one of the most difficult tasks for system designers. Three features of the SDRAM controller were designed to alleviate that task.

First, the address signals and the RAS, CAS and WE signals have two clock periods of setup time instead of the normal one. The controller uses an SDRAM burst size of two; larger burst accesses to SDRAM are broken down into multiple two-word bursts. Thus, the long setup time was easy to implement for all but the first SDRAM access in a set. Providing an extra clock of setup time for the first access in a set will add one clock cycle to the access. To allow the user to make that trade-off, a bit in the timing register controls whether the logic adds the extra setup time for a set of accesses.

Second, the setup time for all data signals on write accesses can also be set to two clock cycles. Since the SDRAM burst size is two, one of the writes in the burst must be disabled using the DQM input to the SDRAM. The burst is then repeated with the other write disabled.

Third, the MPU has two read paths for data to follow from the external data bus to the internal one. For a small system with a lightly loaded data bus, the MPU can sample the input data with its internal clock directly. The other read path relaxes the timing requirements for SDRAM data by sampling it with a clock supplied by a clock feedback pin. The output of that extra pipeline register is then sampled by the internal clock. SDRAM read cycles require one extra clock when the pipeline path is used.
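The first two features can be sketched as follows: a long access is split into two-word bursts, and a write burst with the longer data setup is issued twice with a different word masked each time. The article does not state which word is masked first or what is driven during a masked beat, so the order shown is an assumption, and the function names are illustrative.

```python
def split_into_bursts(words):
    """Break a long access into the two-word bursts the controller issues."""
    return [words[i:i + 2] for i in range(0, len(words), 2)]

def masked_write_bursts(pair):
    """Issue one two-word write burst twice, masking a different word each time.

    Each beat is (word, dqm); dqm=True means the SDRAM ignores that beat.
    The pass order is an assumption, not taken from the article.
    """
    first, second = pair
    return [
        [(first, False), (second, True)],   # pass 1: only the first word is written
        [(first, True), (second, False)],   # pass 2: only the second word is written
    ]
```

Across the two passes, every word is written exactly once, at the cost of repeating the burst.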

The verification (prefabrication) and confirmation (post-fabrication) of a chip like this are very difficult tasks. We built a simulation environment that modeled a TMPR4925-based system, including devices connected to all the controllers and interfaces to be checked. We applied different levels of checking to the peripherals, depending on their origin. Peripherals taken from other projects, whose operation had already been confirmed, did not require thorough verification. In those cases, we verified all the interconnections within the MPU and a few simple aspects of the peripheral's operation.

For the peripherals that were designed for this chip, we did a much more complete verification. The system architect, peripheral designer and tester all contributed tests to the final list of tests to be run for each peripheral. The test list included individual peripheral tests as well as full system tests that checked the interoperation of many of the peripherals in different modes. The number of different mode combinations in the system tests was huge and made an exhaustive test impossible, so we used our judgment to pick the most appropriate tests.

In addition to the directed tests, we developed a set of tests in which many of the modes and parameters of the test were chosen at random, and we ran those randomized tests for hundreds of hours on several workstations.
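The article does not describe how the randomized tests were generated. As a rough sketch of the general approach, a seeded generator can pick one value per mode so that any failing run can be reproduced from its seed. The mode values below are options mentioned in the article; the dictionary keys and function name are hypothetical.

```python
import random

# Mode options mentioned in the article; the key names are illustrative.
MODE_SPACE = {
    "memory_type": ["SDRAM", "SyncFlash"],
    "data_width": [16, 32],
    "read_path": ["direct", "pipelined"],
    "extra_address_setup": [False, True],
}

def random_config(seed):
    """Pick one value per mode; the seed makes a failing run reproducible."""
    rng = random.Random(seed)
    return {name: rng.choice(values) for name, values in MODE_SPACE.items()}

# Generate a few configurations, one per seed.
for seed in range(3):
    print(seed, random_config(seed))
```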

Models were not available to check all the features. In some cases, we developed our own models or monitors to check the operation or timing of external interfaces. We grouped much of the external test logic into a module called the test FPGA; that logic was designed using synthesizable RTL. We also designed an evaluation board to test the first samples of the chip. The board included the actual devices for which we had used models in the simulation environment. The miscellaneous test logic that had been in the test FPGA in our simulation environment was implemented in a real FPGA on the board.

Deriving the evaluation board from the simulation environment allowed us to port many of the verification tests to the evaluation board for chip confirmation. We tested many more mode combinations and ran many more iterations of randomized tests on the real system. In addition, the specialized interfaces that were included from previous designs were checked by an independent team using an independent test board.

---
Charles Minter, director of embedded RISC engineering at Toshiba America Electronic Components Inc. (Irvine, Calif.), holds a PhD in computer science from Yale University. Gary Neben, senior staff engineer, holds an MSEE degree from the Massachusetts Institute of Technology. Titus Smith, senior logic designer, earned an AASEE degree from the Denver Institute of Technology.

http://www.isdmag.com
Copyright © 2002 CMP Media LLC
3/1/02, Issue # 14153, page 10.




 



 
